Abstract

Motivated by the problem of sorting, we introduce two simple com- 
binatorial models with distinct Hamiltonians yet identical spectra (and 
hence partition function) and show that the local dynamics of these mod- 
els are very different. After a deep quench, one model slowly relaxes to 
the sorted state whereas the other model becomes blocked by the presence 
of stable local minima. 



1 Introduction 

Viewing optimization problems that arise in computer science from the perspective
of statistical mechanics has led to successful insights [?]. From this point
of view, where optimization algorithms are intimately related to the energy or 
cost function, the features of the energy landscape are crucial in determining the 
success or otherwise of optimization. This connection is less evident in computer 
science since algorithms do not necessarily have such a direct connection with 
the cost function. One of the open questions of the field is to understand the ex- 
tent to which the performance of algorithms can be determined by the character 
of the optimization problem and not the details of the algorithm itself.

In this paper we hope to illuminate the issue by introducing a pair of models 
with the unusual feature that although they have the same static properties 
determined by the list of energy levels, the energy landscape is different since 
the notion of which states are close to each other is not the same. We per- 
form dynamical simulations of relaxation after a quench to explore the energy 
landscape for each of the models following a similar strategy to that used to 
investigate other computer science problems such as satisfiability (K-SAT) and 
graph colouring [?].

One of the best known optimization models is the Traveling Salesman Problem
(TSP) [?]. States can be labeled by permutations since the paths in the
TSP correspond to various orders of the cities visited. The combinatorial mod- 
els we consider in this paper have states which are permutations of individual 
numbers rather than permutations of more general objects such as the set of 
city coordinates used in the TSP. In this case the natural optimization problem 
is sorting the numbers and we shall take that problem as sufficient motivation. 
The problem of sorting has been extensively treated in the computer science 
literature [?] and many efficient algorithms are known. Most of this work is
concerned with analyzing the time taken to perform the sort, though issues
such as memory requirements and the ability to use cache are also important.

For some optimization problems, such as TSP, the cost function is clearly 
the length of the path and this can be taken as the Hamiltonian of the statistical 
mechanical model. The problem of sorting does not have a uniquely obvious
cost function. We require that the lowest energy corresponds to the sorted state
but have freedom to decide how to measure the degree to which other permutations
are sorted. In this paper we consider two different energy functions that compute
the degree of sortedness in different ways: one is similar to the TSP and measures 
the cost in terms of the length of a path, the other assumes knowledge of the 
sorted state and computes the distance from that state in a direct manner. 

By investigating these two models we discover that the spectrum of energies 
is identical. This implies that the partition functions are also identical and all 
static properties will be the same. Yet the Hamiltonians are different, and the 
energy each Hamiltonian assigns to a given state is not the same. We make 
some preliminary observations on the mapping between states with a given 
energy according to one Hamiltonian and the same energy according to the other 
Hamiltonian, but do not delve into mathematical details. Of more concern to 
this presentation is the fact that although no physical distinction between the 
two models is visible at the static level, it is manifest in the dynamics. We 
investigate the local dynamics after a deep quench and show how it displays 
very different behavior for each model: in one case the (sorted) ground state 
is eventually found, in the other case it is not. The difference in behavior is 
identified as being due to rather different energy landscapes, with stable local 
minima in one Hamiltonian but not the other. 

From a physical point of view, study of the dynamical relaxation after a 
quench falls under the topic of phase ordering kinetics [?]. From the perspective
of computer science, it is the natural way to study the efficiency of a local search 
algorithm in finding the optimum solution to a problem. 

2 Models 
2.1 State Space 

For statistical mechanical approaches to sorting, the states or configurations are 
the permutations of a set of numbers. In the following, we shall consider the 
set of N integers 1, 2, ..., N. A more general model, in which the numbers are
taken to have randomly chosen real values, is briefly described in appendix A.
Most of the properties investigated in this paper hold for both models, but for 
the sake of clarity we shall work exclusively with the simpler integer model. 

The states correspond to all possible permutations P of the set of N integers. 
The permutation label can be regarded either passively as an ordered N-tuple
or actively as the mapping needed to get to that N-tuple from the identity [7].
Generally we shall employ the first interpretation, so $P_i$ signifies the i-th
element of the permutation $(P_1, P_2, \ldots, P_N)$, but the second interpretation will be
convenient later when we use it to write the permutation in terms of cycles. We
imagine the $P_i$ as analogs of spins on a line, and sometimes we will refer to this
line as a one dimensional spatial direction. 

The size of the state space grows as $N!$ in comparison with typical (Ising)
spin models where the space grows as $2^N$. It is well-known that models such
as this, where the state space grows faster than exponentially, have difficulties 
of interpretation since scaling of the temperature or energy with N is neces- 
sary to ensure that certain quantities are extensive. In this work, we do not 
investigate the phase structure, and never need to perform this scaling since we 
only consider zero (or infinite) temperature. We refer the reader to Mezard and
Parisi [?] and to Anderson and Fu [9] for further discussion of this matter.

2.2 Two Energy Functions 

We consider two energy or cost functions that introduce a measure of the dis- 
tance of a permutation away from the identity permutation. The energy is 
lowest and vanishes when evaluated for the ordered or identity permutation. In 
writing these expressions it is convenient to introduce the analog of the Kro- 
necker delta overlap between individual spins as the one dimensional distance 
metric between the numbers $P_i$:

$$ d(P_i, P_j) = |P_i - P_j| \tag{1} $$

The first energy is familiar as the cost function for the TSP [?], here evaluated
for the one-dimensional case. The boundary conditions are slightly different
from the usual TSP since we insist they be fixed rather than periodic. The con- 
nection with sorting is clearer with fixed boundary conditions and we need not 
consider the degeneracy of all states under cyclic shifts or inversions. However, 
it should be noted that the bulk part of the energy minimizes to either ascending 
or descending order, and it is only the boundary conditions that select ascending 
order. This energy, which henceforth we shall call the TSP energy ($E_{TSP}$), is:

$$ E_{TSP}(P) = \frac{1}{2N} \sum_{i=0}^{N} d(P_{i+1}, P_i) - \frac{N+1}{2N}
            = \frac{1}{2N} (P_1 - P_N) + \frac{1}{2N} \sum_{i=1}^{N-1} |P_{i+1} - P_i| \tag{2} $$

The first form is written in the standard form for TSP and we have additionally
assumed $P_0 = 0$, $P_{N+1} = N + 1$ for any P. In the second form the
constant term is removed by writing the end terms of the sum explicitly. Instead
of using the function (1), powers (notably quadratic) of the node position
differences could be considered, but these energy functions do not appear to 
be natural in this context and do not obey the properties we will demonstrate 
below. 
Figure 1: Geometrical representations for an N = 7 configuration: the permutation
1364725. In the top diagram, the TSP representation is shown. Note the
dependence on boundary conditions and that there is a continuous path from 
one wall to the other. The second diagram interprets the displacement energy as 
bipartite matching: each line could be regarded as a spring pulling the relevant 
point to its sorted location. The final diagram is a different representation of 
the displacement energy as an assignment problem using just one set of nodes. 
In this representation, the permutation splits up into several separate pieces 
corresponding to its disjoint cycles. 



The energy function for our second model, which we shall term the displacement
energy due to its connection with the "total displacement" of R.W. Floyd [?],
is given as a sum of site distances from the sorted configuration $P^{(0)}_i = i$:

$$ E_D(P) = \frac{1}{2N} \sum_{i=1}^{N} d(P^{(0)}_i, P_i) = \frac{1}{2N} \sum_{i=1}^{N} |P_i - i| \tag{3} $$

This form of cost function is familiar as an "assignment problem" [?].

The choice of distance measure relying on site differences is by no means 
unique. Many other ways of defining the distance between configurations are 
possible. For example, in the physics literature, an overlap based on counting 
similar links is common. Another overlap (that happens to yield a tractable
model [12] ) is based on the matching problem and is determined by counting 
the number of sites that are in their correct relative positions. 

The different cost functions can be interpreted geometrically as illustrated
in figure 1. For the permutation 1364725 with N = 7, used as the example in
the figure, the energies may be computed as:

$$ E_{TSP} = \frac{1}{14} (6 + 2 + 3 + 5 + 6) - \frac{1}{14} (8) = 1 $$

$$ E_D = \frac{1}{14} (1 + 3 + 4 + 2 + 2) = \frac{6}{7} \tag{4} $$
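As a check on the worked example, the two energies of equations (2) and (3) can be evaluated directly. The following is a minimal sketch; the function names `e_tsp` and `e_disp` are our own labels and exact rational arithmetic is used only for convenience.

```python
from fractions import Fraction

def e_tsp(p):
    """TSP energy of eq. (2), with fixed walls P_0 = 0 and P_{N+1} = N + 1."""
    n = len(p)
    path = (0,) + tuple(p) + (n + 1,)
    length = sum(abs(b - a) for a, b in zip(path, path[1:]))
    return Fraction(length - (n + 1), 2 * n)

def e_disp(p):
    """Displacement energy of eq. (3): distance of each entry from its sorted site."""
    n = len(p)
    return Fraction(sum(abs(v - i) for i, v in enumerate(p, start=1)), 2 * n)

example = (1, 3, 6, 4, 7, 2, 5)
print(e_tsp(example), e_disp(example))  # 1 6/7
```

Both functions vanish on the sorted state, as required of a cost function for sorting.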
These expressions and the diagrams in the figure make it clear that the
energy may be computed as a sum over runs (either ascending or descending) 
in the case of the TSP energy, and as a sum over cycles in the case of the 
displacement energy. A run is the term used in the combinatorial literature 
for a subsequence of adjacent elements that are in sorted (or antisorted) order. 
For the TSP energy each contribution is the difference between the maximum 
and minimum value contained in the run. However, the contribution of each 
cycle to the displacement energy is not usually a single term (for more complex 
permutations than that shown in the figure) and it is necessary to look at runs 
within a cycle. 

3 Relation between Models 

The energy spectra of $E_{TSP}$ and $E_D$ are identical. That is, there is a one-to-one
map between the full list of energies for all states computed with $E_D$ and the
list computed with $E_{TSP}$. Of course there is a shuffling in the way the states
are associated with energies.

For example, figure 2 shows the spectrum of the N = 4 model, with states
labeled for each energy function. For many, but not all, of these N = 4 states,
the energy is the same whether evaluated with the TSP or the displacement
energy. The proportion of states with invariant energy decreases at larger N.
There are many degeneracies between energy levels: in the figure there are a
9-plet, a 7-plet, a 4-plet and a triplet besides the singlet ground state (which
is always unique) that together make the 4! = 24 states. In appendix B we
list the multiplicity structure for small values of N. This information might be
expected to lead to an expression for the partition function. However, we have
been unable to obtain a general formula for these multiplicities and according
to Knuth [?], the generating function does not appear to have a simple form.
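The equality of the spectra can be confirmed by brute force for small N. The sketch below is illustrative only: the helper names are ours, and it works with the integer quantities $2N E_{TSP}$ and $2N E_D$ to avoid fractions.

```python
from collections import Counter
from itertools import permutations

def tsp_length(p):
    """2N * E_TSP: wall-to-wall path length minus the constant N + 1."""
    n = len(p)
    path = (0,) + tuple(p) + (n + 1,)
    return sum(abs(b - a) for a, b in zip(path, path[1:])) - (n + 1)

def displacement(p):
    """2N * E_D: total displacement from the sorted state."""
    return sum(abs(v - i) for i, v in enumerate(p, start=1))

def spectrum(cost, n):
    """Multiplicity of each energy level over all N! permutations."""
    return Counter(cost(p) for p in permutations(range(1, n + 1)))

for n in range(2, 8):
    assert spectrum(tsp_length, n) == spectrum(displacement, n)

print(sorted(spectrum(tsp_length, 4).items()))
# [(0, 1), (2, 3), (4, 7), (6, 9), (8, 4)]
```

For N = 4 the multiplicities 1, 3, 7, 9 and 4 reproduce the singlet, triplet, 7-plet, 9-plet and 4-plet quoted in the text.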

3.1 Mapping between Models 

To demonstrate the relation between the two models formally, we present a
mapping between the states P → P' (in fact a permutation in the state space)
that has the property that $E_D(P) = E_{TSP}(P')$.

The mapping relies on the representation of the state P in terms of a permutation
of the ordered state (1, 2, 3, ..., N) using the cyclic representation. This
mapping is well-known and is for example described in Knuth [13].

We write any permutation in terms of M distinct cycles (singleton cycles
included explicitly). Each cycle is labeled by a superscript j and contains $m_j$
elements:

$$ (i^1_1 i^1_2 \ldots i^1_{m_1}) \cdots (i^j_1 i^j_2 \ldots i^j_{m_j}) \cdots (i^M_1 i^M_2 \ldots i^M_{m_M}) \tag{5} $$

We fix the freedom available in the way the cycle form is written by requiring
that:

• The largest element appears at the start of each cycle: $i^j_1 > i^j_k$ for all other
k in the cycle, k = 2, 3, ..., $m_j$.

• Cycles appear in the order determined by their first elements: $i^{j+1}_1 > i^j_1$
for j = 1, 2, ..., (M - 1).
Figure 2: Spectrum for N = 4. The multiplets and their energies are shown
in the centre; the left column shows the states according to the TSP energy and
the right column shows them according to the displacement energy. States that 
are marked with a "*" take different energies depending on the energy function 
used. 
Note that this is slightly different from the convention defined by Knuth.

The action of this permutation on the ordered state (1, 2, 3, ..., N) is to take
the value $i^j_k$ to $i^j_{k+1}$ (or cyclically, $i^j_{m_j}$ to $i^j_1$). The resulting permutation is the original
state P of the map. The state P' obtained from the map is given simply by
removing the brackets from the cycle form (5) above and regarding
the list of numbers as a permutation.

For example with N = 4, the cyclic form (1)(423) takes 1234 to 1342. Thus 
P = 1342 and P' = 1423. In this case, both P and P' lie in the same multiplet 
according to either the TSP or displacement energy. This is not the case for P
= 3412 and P' = 3142 that appear at the top of figure 2 and correspond to the
cyclic form (31)(42). Nonetheless, $E_D(3412) = E_{TSP}(3142)$.
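The construction can be sketched in a few lines and tested exhaustively for small N. This is an illustration under our own naming (`cycle_map`, `tsp_length`, `displacement`), using the integer energies $2N E$; it follows the two ordering conventions stated above.

```python
from itertools import permutations

def tsp_length(p):
    """2N * E_TSP with walls P_0 = 0 and P_{N+1} = N + 1."""
    n = len(p)
    path = (0,) + tuple(p) + (n + 1,)
    return sum(abs(b - a) for a, b in zip(path, path[1:])) - (n + 1)

def displacement(p):
    """2N * E_D."""
    return sum(abs(v - i) for i, v in enumerate(p, start=1))

def cycle_map(p):
    """Map P -> P': cycles with largest element first, ordered by increasing
    first element, then the brackets are dropped."""
    n = len(p)
    seen = [False] * (n + 1)
    cycles = []
    for start in range(n, 0, -1):      # the largest unvisited element opens its cycle
        if seen[start]:
            continue
        cycle, k = [], start
        while not seen[k]:
            seen[k] = True
            cycle.append(k)
            k = p[k - 1]               # site k holds the next element of the cycle
        cycles.append(cycle)
    cycles.sort(key=lambda c: c[0])
    return tuple(v for c in cycles for v in c)

print(cycle_map((1, 3, 4, 2)))  # (1, 4, 2, 3)
print(cycle_map((3, 4, 1, 2)))  # (3, 1, 4, 2)

for n in range(1, 8):
    states = list(permutations(range(1, n + 1)))
    images = [cycle_map(p) for p in states]
    assert len(set(images)) == len(states)   # the map is one-to-one
    assert all(displacement(p) == tsp_length(q) for p, q in zip(states, images))
```

The two printed cases are the N = 4 examples of the text; the loop checks that the map is a bijection with $E_D(P) = E_{TSP}(P')$ for every state up to N = 7.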

The displacement energy in the state P is given by a sum over cycles as:

$$ E_D = \frac{1}{2N} \sum_{j=1}^{M} \left( d(i^j_1, i^j_2) + d(i^j_2, i^j_3) + \cdots + d(i^j_{m_j - 1}, i^j_{m_j}) + d(i^j_{m_j}, i^j_1) \right) \tag{6} $$

The TSP energy can be evaluated in the state P' to obtain three terms, one
from the elements at the end of the original sum in (2), one from the contribution
of each cycle, and a term corresponding to the intercycle contributions.

$$ E_{TSP} = \frac{1}{2N} (i^1_1 - i^M_{m_M}) + \frac{1}{2N} \sum_{j=1}^{M-1} d(i^j_{m_j}, i^{j+1}_1)
           + \frac{1}{2N} \sum_{j=1}^{M} \left( d(i^j_1, i^j_2) + d(i^j_2, i^j_3) + \cdots + d(i^j_{m_j - 1}, i^j_{m_j}) \right) \tag{7} $$

The cycle sums appearing in (6) and the last term in (7) are identical except
for one additional term in the first equation. The total difference between the
energies is given by:

$$ E_{TSP} - E_D = \frac{1}{2N} (i^1_1 - i^M_{m_M}) + \frac{1}{2N} \sum_{j=1}^{M-1} |i^{j+1}_1 - i^j_{m_j}| - \frac{1}{2N} \sum_{j=1}^{M} |i^j_1 - i^j_{m_j}| \tag{8} $$

Now, by using the requirements on ordering of the $i^j_k$'s stated above, we find
that the modulus signs may be removed in each sum and the total vanishes. 
This does not occur for other choices of distance function, but does continue to 
hold for the model based on real numbers rather than integers. 
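The cancellation can be written out explicitly; the following worked step uses only the two ordering requirements. Since $i^j_1$ is the largest element of cycle j, $|i^j_1 - i^j_{m_j}| = i^j_1 - i^j_{m_j}$, and since $i^{j+1}_1 > i^j_1 \ge i^j_{m_j}$, $|i^{j+1}_1 - i^j_{m_j}| = i^{j+1}_1 - i^j_{m_j}$. The difference of the two energies then telescopes:

$$ 2N (E_{TSP} - E_D) = (i^1_1 - i^M_{m_M}) + \sum_{j=1}^{M-1} (i^{j+1}_1 - i^j_{m_j}) - \sum_{j=1}^{M} (i^j_1 - i^j_{m_j})
 = (i^1_1 - i^M_{m_M}) + \left( \sum_{j=2}^{M} i^j_1 - \sum_{j=1}^{M} i^j_1 \right) + \left( \sum_{j=1}^{M} i^j_{m_j} - \sum_{j=1}^{M-1} i^j_{m_j} \right)
 = (i^1_1 - i^M_{m_M}) - i^1_1 + i^M_{m_M} = 0 $$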

The map is invertible due to the requirements listed above. Note that the 
map must leave the ground state unchanged. However, the multiplet structure
is not preserved: states that appear in a certain multiplet according to one
energy function may appear in a different multiplet according to the other energy
function. This can be seen in the case of N = 4 as the states marked with a
"*" in figure 2.

Several distinct versions of the map exist, based on different conventions for 
the way the cycle form is written. An approach to understanding the symmetries 
of the system would be to combine a map and the inverse of a different version. 
We do not consider this approach here since it takes us too far afield from the 
aim of the present paper. 
3.2 Average Energy



Here we consider statistics of the energy (either TSP or displacement) spectrum: 
the average energy and its fluctuation. These are averages from a combinatorial 
point of view in which all states contribute equally; thermodynamically they 
are effectively at infinite temperature.

An estimate of the large N behavior is most easily obtained by writing the
TSP energy in terms of $\langle \Delta \rangle$, the average distance between nodes:

$$ \langle E_{TSP}(N) \rangle = \frac{1}{2N} \sum_{i=0}^{N} \langle |P_{i+1} - P_i| \rangle \approx \frac{1}{2} \langle \Delta \rangle = \frac{N}{6} \tag{9} $$

where the approximation consists in ignoring correlations between the distances
between different pairs of nodes. In this case, the pairs can be imagined as
two independent points thrown at random in the interval [0, N], so the average
distance between them is $\langle \Delta \rangle = N/3$.

A more formal computation based on the displacement energy proceeds as
follows, where $\sum_P$ indicates a sum over all permutations.

$$ \langle E_D(N) \rangle = \frac{1}{2N \, N!} \sum_P \sum_{i=1}^{N} |P_i - i|
                        = \frac{(N-1)!}{2N \, N!} \sum_{i=1}^{N} \sum_{k=1}^{N} |k - i| \tag{10} $$

The order of summation is exchanged in the second line after which the sum over
permutations becomes simple since there are (N - 1)! permutations in which $P_i$
takes a given value k.

The resulting sum can be performed using standard techniques and a similar 
argument holds for the second moment. 

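$$ \langle E(N) \rangle = \frac{N^2 - 1}{6N} $$

$$ \langle E^2 \rangle - \langle E \rangle^2 = \frac{(N + 1)(2N^2 + 7)}{180 N^2} \tag{11} $$

The width of the energy distribution, $\sqrt{\langle (E - \langle E \rangle)^2 \rangle}$, scales as $N^{1/2}$ while
the mean grows as N, so the distribution becomes relatively more peaked at large N.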
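Both moments in equation (11) can be checked against an exhaustive average over all permutations for small N. This is a minimal sketch with our own function names, using exact fractions so that the comparison is an equality rather than a numerical tolerance.

```python
from fractions import Fraction
from itertools import permutations

def e_disp(p):
    """Displacement energy of eq. (3)."""
    n = len(p)
    return Fraction(sum(abs(v - i) for i, v in enumerate(p, start=1)), 2 * n)

def moments(n):
    """Mean and variance of the energy with all N! states weighted equally."""
    energies = [e_disp(p) for p in permutations(range(1, n + 1))]
    mean = sum(energies) / len(energies)
    var = sum(e * e for e in energies) / len(energies) - mean * mean
    return mean, var

for n in range(2, 8):
    mean, var = moments(n)
    assert mean == Fraction(n * n - 1, 6 * n)
    assert var == Fraction((n + 1) * (2 * n * n + 7), 180 * n * n)
```

Because the two spectra coincide, the same values hold for the TSP energy.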



4 Dynamics 

Although the energy spectra of the two models are identical and the partition
functions are the same, the models still have distinct properties. This would 
be apparent by studying the response to some external field that couples to 
states in the same way in each model. The displacement energy itself could 
be regarded as an example of this kind of additional term in the cost function. 
However, in the context of this paper and the problem of sorting, the distinction
is best studied by looking at the dynamical properties of the models. 

We only consider local dynamics: that is the basic moves are adjacent trans- 
positions. Of course there is no physical basis to these models requiring locality, 
and in real sorting algorithms non-locality of the elementary moves is essential 
to obtain efficient sorting. Furthermore, we only consider a dynamics associated 
with the energy function, namely the Monte Carlo Metropolis algorithm. This
is natural from a theoretical physics perspective (even though it does not correspond
to any physical dynamics [?]), but algorithms that have a much less clear
relationship with the cost function are common in computer science. Indeed we
could imagine a reasonable algorithm that selects random sites and transposes
with the neighbor if they are out of order.

The usual approach physicists have used to investigate computer science
problems is to study the dynamical relaxation under local search algorithms [?].
From the physical point of view, the approach consists in studying the phase
ordering kinetics after a deep quench [?].

Figure 3: Dynamical evolution of the energy according to zero temperature 
Monte Carlo using local moves. The curve that reaches a plateau is for the TSP 
energy and the other is for the displacement energy. Size N = 100. An average 
over 1000 different initial states is taken with error bars that are too small to 
be shown. 



4.1 Zero Temperature Metropolis Algorithm 

The Monte Carlo moves are local adjacent transpositions and at zero tempera- 
ture this effectively constitutes a randomized steepest descent algorithm. The 
Metropolis algorithm we shall use selects a trial site at random and transposes
it with its (right hand) neighbor provided this move reduces the energy (or does
not change it). We perform numerical simulations starting from a random configuration
which typically has energy very close to the average energy computed
in section 3.2.

Figure 3 shows the results of numerical studies of dynamics according to
this algorithm. The two curves in the figure correspond to dynamics based on 
each of the energy functions we have defined. Starting from a randomly chosen 
initial state, the plot follows the evolution of the energy averaged over many 
choices of this initial state. This situation corresponds to a quench from very 
high temperature to zero temperature. 

The model based on the displacement energy evolves in the expected man- 
ner: the energy slowly reduces and eventually reaches the sorted ground state. 
The other model, based on the TSP energy, starts with a similar initial energy
then rapidly reduces to a plateau value at which level it continues indefinitely.
No further decrease in energy is evident, even for much longer runs, and the 
energy never arrives at the sorted ground state. This behavior can be improved 
somewhat by using simulated annealing but it remains extremely slow and still 
tends to get stuck. 
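The zero temperature dynamics described above can be sketched as follows. This is not the code used for the figures: the function names, the seed, the system size and the number of steps are all arbitrary choices of ours, and the energies are the integer quantities $2N E$. Flat moves are accepted, as stated in the text.

```python
import random

def tsp_length(p):
    """2N * E_TSP with walls P_0 = 0 and P_{N+1} = N + 1."""
    n = len(p)
    path = [0] + list(p) + [n + 1]
    return sum(abs(b - a) for a, b in zip(path, path[1:])) - (n + 1)

def displacement(p):
    """2N * E_D."""
    return sum(abs(v - i) for i, v in enumerate(p, start=1))

def quench(state, cost, steps, rng):
    """Zero temperature Metropolis with adjacent transpositions.
    Returns the final state and the energy after every trial move."""
    state = list(state)
    energy = cost(state)
    trace = [energy]
    for _ in range(steps):
        i = rng.randrange(len(state) - 1)        # trial site, swapped with its right neighbour
        state[i], state[i + 1] = state[i + 1], state[i]
        trial = cost(state)
        if trial <= energy:                      # accept downhill and flat moves
            energy = trial
        else:                                    # reject: undo the swap
            state[i], state[i + 1] = state[i + 1], state[i]
        trace.append(energy)
    return state, trace

rng = random.Random(0)                           # arbitrary seed
n = 20
start = list(range(1, n + 1))
rng.shuffle(start)
for name, cost in (("TSP", tsp_length), ("displacement", displacement)):
    final, trace = quench(start, cost, 20 * n * n, rng)
    print(name, trace[0], trace[-1], final == sorted(final))
```

Recomputing the full cost at every step keeps the sketch short; a production version would update only the few terms affected by the swap.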





Figure 4: Scaling of the dynamical evolution of the displacement energy ac- 
cording to zero temperature Monte Carlo using local moves. Six size systems 
from N = 100 to N = 3200 are shown with axes scaled to demonstrate data
collapse. The energy axis is scaled by 1/N and is plotted logarithmically. The
time axis is scaled by $1/N^2$. An average over 1000 different initial states is taken
with errors that are too small to see on the figure. 
4.2 Timescales for Energy Decay

According to figure 3 the displacement energy appears to reduce at an exponential
rate. The log-log plot (not shown) is only able to substantiate this for
the first part of the decay. The quality of this initial exponential decay, and the
form of the subsequent deviations from this form, are shown in figure 4.

Within the initial exponential decay region it is possible to measure the N
dependence of the exponential timescale. At short times this characteristic time
scales as $N^2$ (fits using times up to order $N^2$ give the exponent with accuracy
~ 0.1%, and value tending to 2.0 as fewer points are taken), so the time axis
of the figure has been scaled as $t/N^2$ in order to collapse the data in the initial
region.

The deviation from exponential form and absence of data collapse in the 
later part of the data is a finite size effect. With the help of small quantities 
of data for very large sizes up to $N = 10^5$, careful measurements of the time
required to reach fixed values of E/N can be made. The way these times scale 
with N shows a consistent trend towards an exponent of 2.0 at larger N. 
Figure 5: Scaling of the dynamical evolution of the TSP energy according to
zero temperature Monte Carlo using local moves. Three size systems, N = 100,
N = 400, and N = 1600 are shown with axes scaled to demonstrate data
collapse. The energy axis is scaled by 1/N. The time axis is scaled by 1/N. An 
average over 1000 different initial states is taken with errorbars that are the size 
of the marks. 

Certainly for any size that can be simulated, the total time to arrive at zero 
energy is affected by the finite size effects and has a scaling exponent larger 
than 2. Not surprisingly, the resulting sort is rather slower than achieved with 
standard sorting algorithms that have best case behavior increasing as N log N.

For the TSP energy, figure 5 shows a scaled plot for different size systems.
In this case the time is only scaled by a factor of 1/N and the energy axis is not
logarithmic. The stability of the plateau energy is very clear in this figure. A
detailed investigation finds that after finite size effects are removed,
the per site energy on the plateau is $E_{TSP}/N = 0.1092 \pm 0.0001$.

The fact that the evolution timescales of the TSP and displacement models
are respectively N and ~ $N^2$ indicates another surprising distinction between
the dynamics of the two models.
Figure 6: The spatial correlator for configurations chosen in the plateau region
(after 20000 steps on figure 3, though the form is the same for all configurations
on the plateau) of the evolution of the TSP energy. Size N = 100 averaged over
1000 different initial states.



4.3 Spatial Correlations 

Throughout this work we have implicitly considered the indices to form a line. 
In this section we consider correlations along this line and refer to such correla- 
tions as spatial. Studying the evolution of these correlators may illuminate the 
dynamical properties of the model since dynamical lengthscales may appear.
By analogy with the definition of the distance metric (1), we define the spatial
correlator at distance r as:

$$ C(r) = \frac{1}{N (N - r)} \sum_{i=1}^{N-r} |P_{i+r} - P_i| \tag{12} $$
Up to terms relating to boundary conditions, C(1) is nothing other than the
TSP energy. A small value of C signifies a strong correlation, and in the fully
sorted state it grows linearly, C(r) = r/N. From arguments similar to those
used to derive the average energy, it can be shown that the correlator takes a
constant value of 1/3 when averaged over randomly chosen configurations (more
precisely, $\langle C(r) \rangle = (N + 1)/3N$ for r > 0). In the figures to follow, we only show
C(r) up to values of r < N/2. This is because for larger r, only a small number
of pairs appear in the sum (12) and the indices of these pairs are often near the
boundaries, thus making this region excessively dependent on boundary effects.
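The two limiting forms quoted for the correlator, r/N in the sorted state and (N + 1)/3N for the uniform average, can be verified exactly for a small system. The sketch below assumes the normalization of equation (12); the function name and the choice N = 6 are ours.

```python
from fractions import Fraction
from itertools import permutations

def spatial_correlator(p, r):
    """C(r) of eq. (12): summed |P_{i+r} - P_i| over the N - r pairs, divided by N(N - r)."""
    n = len(p)
    total = sum(abs(p[i + r] - p[i]) for i in range(n - r))
    return Fraction(total, n * (n - r))

n = 6
ordered = tuple(range(1, n + 1))
states = list(permutations(ordered))
for r in range(1, n):
    assert spatial_correlator(ordered, r) == Fraction(r, n)      # sorted state: linear growth
    average = sum(spatial_correlator(p, r) for p in states) / len(states)
    assert average == Fraction(n + 1, 3 * n)                     # uniform average: independent of r
```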

As the configuration evolves according to the TSP model, the spatial cor- 
relator remains similar to its initial constant value 1/3 corresponding to the 
initial random configuration, but some structure develops at small r. Once the 
plateau is reached, there is no further change in the spatial correlator and figure
6 shows its final form. The non-trivial structure has a clear size, of less than 10
units, and neither the size scale nor the shape of the structure depends on N.
This scale indicates the distance over which ordering can take place according 
to the TSP dynamical process. The mechanism that blocks further growth of 
the order beyond this scale is discussed below. 
Figure 7: The evolution of the spatial correlator for the displacement energy.
Size N = 100 averaged over 1000 different initial states. In order of increasing
slope the lines are for configurations taken every 10000 Monte Carlo steps after 
the start of the simulation. There is no change after 40000 steps. 

In the case of the displacement energy, the correlator evolves as shown in
figure 7. Here, no detailed structure ever appears, and the correlator is always
a straight line with a gradient that smoothly evolves towards the steepest slope
corresponding to the final sorted ground state. There is no characteristic lengthscale
over which ordering takes place and then grows. It would rather appear
that the system becomes organized on all scales simultaneously.



4.4 Time Correlations 

In order to investigate correlations between configurations at different times, we 
use the same overlap that was employed for the displacement energy. 



$$ C(t, t') = \frac{1}{2N^2} \sum_{i=1}^{N} |P_i(t) - P_i(t')| \tag{13} $$



Other choices are certainly possible. For example, an overlap based on
matching counts the number of positions where the number has not changed. 
The graphs based on this choice do not convey any significantly different infor- 
mation from those given below. 
Figure 8: Two-time correlators $C(t, t_w)$ shown for different waiting times for the
evolution according to TSP energy. Size N = 100 averaged over 1000 different 
initial states. Waiting time is zero for the top curve and increases by 200 in 
each lower curve until the bottom curve which is the same for any waiting time 
greater than 1000. This lowest curve indicates that aging does not occur. 

The simplest correlation to measure is against the completely sorted ground 
state. This is none other than the displacement energy itself, and was shown 
(for dynamics based on this energy) in figure 

Another familiar comparison is against the initial state. For the dynam- 
ics according to the displacement energy, the correlation with the initial state 
disappears rapidly (within about 20N steps), and we do not show a figure in 
this case. For the dynamics according to the TSP energy, we show in figure 8
the two-time correlators for different waiting times plotted against $\log(t - t_w)$ in
the conventional manner. For $t_w = 0$ the correlation against the initial state is
included in this figure. We have drawn two-time correlators not because aging
occurs in this model (it does not) but as a convenient way of demonstrating two
features: that the configurations on the plateau retain some correlation with the
original random state, and that evolution is not frozen on the plateau.
With this aim, the waiting times shown in figure 8 are quite short.

The final value of the $t_w = 0$ curve (about 0.092) is much less than that
associated with correlation between random configurations (1/6 in this case due
to the factor of 2 in the definition (13)), indicating that not all information in
the original configuration has been lost by the time the plateau is reached. This
agrees with the result of the spatial correlation that indicates that only local 
modification takes place. 

The lowest two-time curve, holding for all $t_w > 1000$ (for size N = 100),
corresponds to waiting times that have reached the plateau. This curve is not
constantly zero as would be the case if there were no dynamics happening on
the plateau. The configuration continues to evolve, though the energy does not
change. However, the correlation is bounded: irrespective of how long after 
the waiting time, the correlator never rises above a certain value (about 0.021). 
This limiting value is independent of N. The reason for this behaviour is the 
limited size of the flat directions that are identified below. 




Figure 9: Local minima for N = 4. The four states in the centre have TSP
energy $E_{TSP} = 3$ while those on the periphery have $E_{TSP} = 4$. Local transpositions
allow three possible moves between states and these are all shown for
the local minima quartet.






4.5 Local Energy Minima 

The fact that a simple algorithm such as zero temperature Monte Carlo is 
unable to find the ground state is hardly a surprise in optimization models. 
There are many statistical mechanical examples of this state of affairs, and the 
effect is usually ascribed to the features of the free energy landscape. In the 
K-Satisfiability problem the difficulty of finding the ground state via a local
search procedure is due to the proliferation of states which trap the search into
a metastable phase. Eventually, a large fluctuation provides a means of reaching
the ground state [?]. However, in our TSP model the reason is more prosaic
and the origin of the effect is the presence of stable local energy minima.
In this respect, it is similar to the XOR-SAT model, that also suffers from such
minima.

Though the two energy functions have matching energy levels they have 
very different energy landscape characteristics. The displacement version has 
no stable local minima, but the TSP version does. Moreover, these minima 
appear to proliferate as N increases and are never avoided. 

For small values of N the local minima of the TSP model may be found
explicitly. No such minima exist for N = 3. For N = 4 there are four minima
connected by Monte Carlo moves as shown in figure 9. Note that these minima
are precisely the states that appear with a "*" in figureEl and indeed the energy 
assignments of figure [5] are inverted for the displacement energy, so in that case 
there are no local minima. 
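The small-N statements above can be checked by brute force. The following is a minimal sketch, not the authors' code. It assumes the convention E_TSP = (1/2) [ sum_{i=0}^{N} |P_{i+1} - P_i| - (N + 1) ] with fixed boundary values P_0 = 0 and P_{N+1} = N + 1, and E_d = (1/2) sum_i |P_i - i|. These normalizations are chosen here because they reproduce the energies quoted in Figure 9, and they may differ from the definitions used elsewhere in the text. A "trap" is taken to be a set of states connected by flat moves from which no adjacent transposition lowers the energy; all function names are hypothetical.

```python
from itertools import permutations


def tsp_energy(perm):
    """Assumed TSP energy: path 0 -> P_1 -> ... -> P_N -> N+1, shifted so sorted = 0."""
    n = len(perm)
    path = (0,) + tuple(perm) + (n + 1,)
    total = sum(abs(b - a) for a, b in zip(path, path[1:]))
    return (total - (n + 1)) // 2  # total - (n + 1) is always even


def displacement_energy(perm):
    """Assumed displacement energy: half the total distance of elements from home."""
    return sum(abs(p - i) for i, p in enumerate(perm, start=1)) // 2


def neighbours(perm):
    """States reachable by transposing one pair of adjacent elements."""
    for i in range(len(perm) - 1):
        s = list(perm)
        s[i], s[i + 1] = s[i + 1], s[i]
        yield tuple(s)


def find_traps(n, energy):
    """Return (energy, plateau) for every flat plateau with no downhill exit."""
    seen, traps = set(), []
    for start in permutations(range(1, n + 1)):
        if start in seen or energy(start) == 0:
            continue
        e0, plateau, stack, has_exit = energy(start), {start}, [start], False
        while stack:  # explore the plateau through flat moves only
            state = stack.pop()
            for nb in neighbours(state):
                e = energy(nb)
                if e < e0:
                    has_exit = True
                elif e == e0 and nb not in plateau:
                    plateau.add(nb)
                    stack.append(nb)
        seen |= plateau
        if not has_exit:
            traps.append((e0, plateau))
    return traps


for n in (3, 4):
    print(n, find_traps(n, tsp_energy), find_traps(n, displacement_energy))
```

Under these assumptions the search finds no trap for N = 3, a single four-state trap of energy 3 for N = 4, and no trap at all for the displacement energy.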




Figure 10: A schematic representation of the trapping configuration found on
the plateau. The turning pairs are shown enclosed. This configuration could
represent the N = 20 configuration: 5 9 17 14 11 6 8 20 16 15 7 4 1 3 10 12
18 19 13 2.

For larger N, the trapping configurations appear as shown in figure 10. They
consist of alternating ascending and descending runs separated by turning pairs.
For an upper turning pair, each element of the pair is greater than either of the
elements neighbouring the pair, with a similar definition for a lower turning
pair. These trapping configurations, found on the plateau, are slightly more
ordered than random configurations, which consist of ascending and descending
runs separated by ordinary turning points. A turning point is a requirement on
a subsequence of length 3, whereas a turning pair is a requirement on a longer
subsequence of length 4. The dynamical Monte Carlo process performs this
small scale ordering (as observed in the spatial correlators) to arrive at the
plateau configurations.
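The definition of a turning pair can be made concrete with a short sketch, applied here to the N = 20 configuration quoted in the caption of Figure 10. The function names are hypothetical, only interior pairs are examined (the boundary conditions are ignored), and the energy change of a swap is taken to be proportional to the change in the sum of neighbouring distances, which is an assumption about the TSP energy rather than a quoted definition.

```python
def turning_pairs(seq):
    """Return (index, kind) for each adjacent pair that is a turning pair.

    The pair (seq[i], seq[i+1]) is 'upper' if both elements exceed both
    neighbours seq[i-1] and seq[i+2], and 'lower' if both are smaller.
    """
    pairs = []
    for i in range(1, len(seq) - 2):
        inner = (seq[i], seq[i + 1])
        outer = (seq[i - 1], seq[i + 2])
        if min(inner) > max(outer):
            pairs.append((i, "upper"))
        elif max(inner) < min(outer):
            pairs.append((i, "lower"))
    return pairs


def swap_cost(seq, i):
    """Change in the sum of neighbouring distances when seq[i], seq[i+1] are swapped."""
    a, b, c, d = seq[i - 1], seq[i], seq[i + 1], seq[i + 2]
    return abs(a - c) + abs(b - d) - abs(a - b) - abs(c - d)


config = [5, 9, 17, 14, 11, 6, 8, 20, 16, 15, 7, 4, 1, 3, 10, 12, 18, 19, 13, 2]
for i, kind in turning_pairs(config):
    print(kind, config[i], config[i + 1], "swap cost:", swap_cost(config, i))
```

For this configuration the interior turning pairs are 17 14, 6 8, 20 16, 1 3 and 18 19. Swapping either member of a turning pair costs nothing, while every other interior transposition increases the sum of neighbouring distances.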

The trapping configurations are stable local minima: flat directions
correspond to the moves that interchange the two elements of the turning pairs,
and all other transpositions raise the energy. The number of states in the trap
is therefore 2^{number of turning pairs}. Since the number of turning pairs
grows linearly with the size of the system, the size of the trap grows
exponentially with N. In appendix C an estimate for the number of turning points
(same as the number of turning pairs) for plateau configurations is derived, so
the exponential growth is approximately 2^{N/3}. Of course, many different traps
exist, each with this typical size.

The picture of the trapping configurations in figure 10 makes the turning
pairs appear like domain walls. This is a reasonable interpretation since the
bulk part of the TSP energy has two different minima with ascending and de- 
scending order and it is these phases that are separated by the turning pairs. The 
boundary conditions raise the degeneracy of the sorted and antisorted phases, 
but this effect never comes into play here since the domain walls are frozen and 
do not move after the initial relaxation. 

The local minima provide a basis for attempting to understand the value of 
the plateau energy observed under the dynamics above. A rather naive argument
given in appendix C, based on characterizing the typical length of ascending or
descending runs in the permutation defining the local minimum state, gives the
per site value of 1/10. This should be compared with the numerical value of
Etsp/N = 0.1092 ± 0.0001 found in section IO 




Figure 11: The final (plateau) energy of the interpolating Hamiltonian as a
function of the interpolation parameter p, for the ordered model of size
N = 40. Error bars indicate an average over 1000 random initial states. Note
the ledge near p = 1, and that only for p = 1 does the algorithm succeed in
finding the zero energy state.






4.6 Interpolating Model 



Given the rather different dynamic properties of the two models, it is natural to 
consider the behavior of models interpolating between them. The interpolating 
Hamiltonian is: 

E(p) = (1 - p) E_{TSP} + p E_d    (14)

where p is a parameter in the interval [0, 1]; p = 0 recovers the TSP energy
and p = 1 the displacement energy.

The average energy (in the combinatoric sense at infinite temperature) is 
independent of p. All other quantities, such as the energy of individual states 
and the degeneracy pattern are different from either of the models previously 
discussed. 

Depending on the value of p, the dynamics is expected to be more like one
or other of the two lines shown in figure 3. One might hope for a transition to
take place for some intermediate value of p, but in fact the dynamics reaches
a plateau for any p < 1. Figure 11 shows how the energy value of the plateau
varies with p. For small N this curve is composed of a series of steps, ending in
a horizontal ledge near p = 1. For larger N the curve becomes smooth and the
width of the ledge shrinks; eventually the curve becomes a sawtooth.
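A quench with the interpolating energy can be sketched as follows. This is an illustrative sketch rather than the simulation used for the figures. It assumes the same energy conventions as Figure 9 (E_TSP with boundary values 0 and N + 1, E_d as half the total displacement, neither divided by N), a single random adjacent transposition per step, and acceptance of any move that does not raise E(p). The name `quench` and its arguments are hypothetical.

```python
import random


def tsp_energy(perm):
    """Assumed TSP energy with fixed boundary values 0 and N + 1."""
    n = len(perm)
    path = [0] + list(perm) + [n + 1]
    return (sum(abs(b - a) for a, b in zip(path, path[1:])) - (n + 1)) // 2


def displacement_energy(perm):
    """Assumed displacement energy: half the total distance of elements from home."""
    return sum(abs(v - i) for i, v in enumerate(perm, start=1)) // 2


def interpolated_energy(perm, p):
    """E(p) = (1 - p) E_TSP + p E_d, as in equation (14)."""
    return (1 - p) * tsp_energy(perm) + p * displacement_energy(perm)


def quench(perm, p, max_steps, rng):
    """Zero-temperature Monte Carlo: keep a random adjacent swap if E(p) does not rise."""
    state = list(perm)
    energy = interpolated_energy(state, p)
    for _ in range(max_steps):
        if energy == 0:
            break  # sorted state reached
        i = rng.randrange(len(state) - 1)
        state[i], state[i + 1] = state[i + 1], state[i]
        trial = interpolated_energy(state, p)
        if trial <= energy:
            energy = trial
        else:
            state[i], state[i + 1] = state[i + 1], state[i]  # reject: undo the swap
    return tuple(state), energy


rng = random.Random(0)
start = list(range(1, 11))
rng.shuffle(start)
for p in (0.0, 0.5, 1.0):
    print(p, quench(start, p, 20000, rng)[1])
```

With floating-point p the comparison `trial <= energy` can be affected by rounding when two different states have nominally equal E(p); passing p as a `fractions.Fraction` keeps the arithmetic exact.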

The question arises as to why stable local minima appear even for an in- 
finitesimal perturbation away from the displacement energy. Consider the state 
corresponding to the completely reversed permutation (with the central two ele- 
ments transposed in the case of N even). The displacement energy of this state 
is flat with respect to any of the local moves, though a series of moves will even- 
tually lead to a lower energy state. On the other hand the same state is part 
of a local minimum quartet with respect to the TSP energy. An infinitesimal 
addition of the TSP component is therefore sufficient to make the state a local 
minimum, albeit with infinitesimally small barriers. Although this argument 
correctly describes the reason that trapping can take place for p so close to 1, 
the identification of completely reversed states as being responsible is incorrect 
since their energy is considerably higher than the plateau. 

5 Conclusion 

Considering statistical mechanical models based on states that are permutations 
of numbers, we have demonstrated a relationship between two models with dis- 
tinct energy functions. Statically their partition functions are identical since 
the energy spectrum is the same. Dynamically there are substantial differences 
since one model has local energy minima and the other does not. This strange 
situation, that models with identical static properties have distinct local dy- 
namics, is not a paradox. In this case it is clearly due to the shuffling of states 
that modifies the energy landscape by rearranging the energies of states that 
are close to each other. 

The static analysis showed that the energy spectrum was the same by demon- 
strating a one-to-one map relating states with the same energy in each model. 
This map continues to hold for the more general model based on real numbers 
rather than integers. 

Dynamically we simulated a deep quench and showed that the energy decay
proceeds in a completely different way in each of the two models. For the
displacement model, no dynamical lengthscale appears as the system is
reorganised. Characteristic timescales do appear and vary as N^2, though with
strong finite size effects. For the TSP model, there was rapid (timescale
varying as N) decay to a plateau as a result of reorganisation over small
spatial scales of size less than 10 units, which did not completely destroy the
correlation with the initial state. Evolution continued to occur on the plateau,
but was limited in range. This was interpreted in terms of trapping
configurations with turning pairs. Flat movement within the trap was still
possible between about 2^{N/3} local minima states.

Optimization problems of most interest, such as K-Satisfiability, are much 
richer than the models presented here. Nevertheless, in the context of the one 
parameter family of interpolating models, we have shown that our simple local 
search procedure exhibits a transition (at the very edge of the interpolation 
region) from being able to sort the numbers to becoming trapped by local min- 
ima. We hope that these models provide a simple arena for studying the general 
question of the role of the energy landscape in the performance of local search 
algorithms. 

A Appendix: Model based on Real Numbers



Instead of the integer model discussed in the text, this more general model
is based on the set of N real numbers x_i, i = 1, 2, ..., N, with each x_i in
the range [0, N + 1]. Without any loss of generality, we take the x_i's to be
ordered (x_i < x_j for all i < j) as this makes the identity permutation of the
indices correspond to the lowest energy, or sorted state. The x_i's should be
regarded as quenched random variables, and for this reason the model might be
regarded as a disordered model in contrast to the integer ordered model. The
disorder
however, is not of the independent variety that has been considered for both 
assignment and TSP problems using a replica approach in [81 117) . but rather 
corresponds to Euclidean distances in one dimension. Here, the replica approach 
becomes intractable since the average over disorder couples sites. 

For the disordered model, the definitions of the two energy functions remain 
exactly as given on the first lines of equations J5J and J3J , however the distance 
function is replaced by: 

d(P_i, P_j) = |x(P_i) - x(P_j)|    (15)

Irrespective of the choice of x's, there are degeneracies in the spectrum, but 
not to the extent found in the ordered model. For example, for N = 4, the 
disordered model has energy levels with multiplicities (9,4,3,3,1,1,1,1,1). In the 
ordered model these combine to give (9,7,4,3,1). 
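The N = 4 multiplicities quoted above can be reproduced by direct enumeration. The sketch below is illustrative only. It assumes the displacement energy (1/2) sum_i |x(P_i) - x_i| and a TSP energy with boundary values 0 and N + 1, and the particular values in `x_real` are an arbitrary choice of ordered numbers in [0, 5], not values taken from the text. Exact rational arithmetic is used so that degenerate levels are not split by rounding.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations


def displacement_energy(perm, x):
    """Assumed displacement energy: half of sum_i |x(P_i) - x_i|."""
    total = sum(abs(x[p - 1] - x[i]) for i, p in enumerate(perm))
    return Fraction(total) / 2


def tsp_energy(perm, x):
    """Assumed TSP energy: path from 0 through x(P_1..P_N) to N + 1, sorted = 0."""
    n = len(perm)
    path = [0] + [x[p - 1] for p in perm] + [n + 1]
    total = sum(abs(b - a) for a, b in zip(path, path[1:]))
    return (Fraction(total) - (n + 1)) / 2


def spectrum(energy, x):
    """Map each energy level to the number of permutations that have it."""
    n = len(x)
    return Counter(energy(perm, x) for perm in permutations(range(1, n + 1)))


def multiplicities(energy, x):
    return sorted(spectrum(energy, x).values(), reverse=True)


x_int = [1, 2, 3, 4]
x_real = [Fraction(3, 10), Fraction(11, 10), Fraction(26, 10), Fraction(42, 10)]
print(multiplicities(displacement_energy, x_int))   # ordered model
print(multiplicities(displacement_energy, x_real))  # disordered model
print(spectrum(tsp_energy, x_real) == spectrum(displacement_energy, x_real))
```

For these inputs the ordered model gives multiplicities [9, 7, 4, 3, 1] and the disordered instance gives [9, 4, 3, 3, 1, 1, 1, 1, 1], and the two assumed energy functions share the same spectrum in both cases.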

The mapping between the energy levels according to each model continues 
to hold, however there is a small difference in the moments. Here the moments 
are defined after additional averaging over the x's. The mean energy is the same 
as in the ordered model by design - indeed it was this criterion that selected the 
range of the x_i's to be [0, N + 1]. The second moment is, however, slightly larger
than quoted in (HJ. 

\langle \langle E^2 \rangle - \langle E \rangle^2 \rangle = \frac{(N+1)^2 (3N^2 + 10N + 2)}{180 N^2 (N+2)}    (16)

Even for individual instances of the disordered model, similar dynamical 
behaviour to that described in the text for the integer model is observed. After 
averaging over realizations of the x_i's, the similarity becomes even closer.



B Appendix: Multiplicities of Degeneracies 

The multiplicities are most easily obtained using the displacement energy E_d.
This may be evaluated for every permutation, and D_N(E) denotes the number
of times that displacement E appears. For small N (up to N ≈ 16), it is
straightforward to numerically enumerate these coefficients, and the first few
are given in the table below.

 N   D(0)  D(1)  D(2)  D(3)  D(4)  D(5)  D(6)  D(7)  D(8)  D(9)
 1     1
 2     1     1
 3     1     2     3
 4     1     3     7     9     4
 5     1     4    12    24    35    24    20
 6     1     5    18    46    93   137   148   136   100    36


The usual arguments intended to lead to recurrence relations that would
relate D_{N+1}(E) to sums of D_N(E') are not helpful in this case. Although
there are some relations based on a sum over partitions of E_d, these are only
valid in the region below the diagonal of the table. The problem appears to be
non-trivial and indeed, according to Knuthp], the generating function does not 
appear to have a simple form. 
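For the smallest sizes the coefficients D_N(E) can be reproduced by brute-force enumeration, as in the sketch below. It assumes that the displacement E is half of sum_i |P_i - i|, which is consistent with the table entries. The function name is hypothetical, and since the loop visits all N! permutations it is practical only for small N, well short of N ≈ 16.

```python
from collections import Counter
from itertools import permutations


def displacement_multiplicities(n):
    """Return [D_n(0), D_n(1), ...] by enumerating all n! permutations."""
    counts = Counter(
        sum(abs(p - i) for i, p in enumerate(perm, start=1)) // 2
        for perm in permutations(range(1, n + 1))
    )
    return [counts[e] for e in range(max(counts) + 1)]


for n in range(1, 7):
    print(n, displacement_multiplicities(n))
```

Each row sums to N!, which gives a quick consistency check on the table.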

C Appendix: TSP Plateau Energy 

The average TSP energy is estimated after the effect of the Monte Carlo moves. 
It is convenient to evaluate the TSP energy by summing contributions from 
ascending and descending runs, rather than from adding each individual term 
which leads to many cancellations. The theory of runs of this kind is presented
in [18], but very little of the general development is necessary here. The main
property we use is that their endpoints are characterized by turning points in 
the permutation sequence. We assume that N is large and only consider leading 
terms. 

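\langle E_{TSP} \rangle = \frac{1}{2N} \sum_{i}^{N} \langle |P_{i+1} - P_i| \rangle \approx \frac{1}{2N} \langle s \rangle \langle \Delta \rangle    (17)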

where ⟨s⟩ is the average number of runs, and ⟨Δ⟩ is the average (absolute)
change between the start and end of the run (Δ = d(P_start, P_end)). The first
approximation in this approach is to ignore the correlations between the number
of runs and the value of Δ for that run in the formula above.

To illustrate this form, let us use it to reproduce the average energy of a
random configuration. In that case, ⟨s⟩ = 2N/3, since by considering sets of
three adjacent points, the central one has a maximum or minimum value in 4 out
of the 6 equally likely orderings. To deduce ⟨Δ⟩ we consider the average value
taken at upper turning points separating runs. Since ⟨max(P_1, P_2, P_3)⟩ = 3N/4,
with the symmetric result N/4 for the minimum, ⟨Δ⟩ = 3N/4 - N/4 = N/2. Combining
these results in equation (17) reproduces ⟨E_TSP(N)⟩ = N/6, as was derived in
the text by considering contributions from all neighboring pairs.
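Both ingredients of this check can be verified exactly for small N by enumeration. The sketch below is illustrative. It counts interior turning points only and evaluates the bulk part of the energy, (1/2N) times the sum of |P_{i+1} - P_i| over adjacent pairs, with the boundary terms omitted; this normalization is an assumption read off from equation (17), and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def count_turning_points(seq):
    """Number of interior elements that are a local maximum or minimum."""
    return sum(
        1
        for a, b, c in zip(seq, seq[1:], seq[2:])
        if (b > a and b > c) or (b < a and b < c)
    )


def bulk_energy(seq):
    """(1/2N) * sum of |P_{i+1} - P_i| over adjacent pairs (boundary terms omitted)."""
    n = len(seq)
    return Fraction(sum(abs(b - a) for a, b in zip(seq, seq[1:])), 2 * n)


def exact_average(f, n):
    """Average of f over all n! permutations, as an exact fraction."""
    perms = list(permutations(range(1, n + 1)))
    return sum(Fraction(f(p)) for p in perms) / len(perms)


for n in (5, 6, 7):
    print(n, exact_average(count_turning_points, n), exact_average(bulk_energy, n))
```

The exact averages are 2(N - 2)/3 turning points and a bulk energy of (N^2 - 1)/(6N), which agree with the leading-order values 2N/3 and N/6 used in the text.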

An estimate of the value of the average in the plateau is obtained by taking 
into account the effect of the Monte Carlo moves. We consider a subsequence 
of four points and look at the effect of a transposition on the central pair of the 
four, in all 24 possible orderings. The four points are labeled 1234, but this only
signifies their relative magnitude. Table 1 indicates whether the transposition
causes a positive, negative or zero change in energy and lists the change in
the number of (internal) maxima or minima.

  P     ΔE   Δs  |   P     ΔE   Δs  |   P     ΔE   Δs
 1234   +     2  |  1432   0     0  |  4312   +     1
 2134   +     1  |  1423   -    -1  |  3421   +     1
 1324   -    -2  |  1342   0     0  |  3412   +     0
 1243   +     1  |  4231   -    -2  |  4213   0     0
 2143   +     0  |  3241   -    -1  |  4123   0     0
 3214   0     0  |  4132   -    -1  |  2431   0     0
 3124   0     0  |  3142   -     0  |  2341   0     0
 2314   -    -1  |  4321   +     2  |  2413   -     0

Table 1: Effect of transposing the central pair of 4 points. ΔE indicates the
sign of the energy change as a result of this move. Δs shows the change in the
number of (internal) turning points.



Using this table, we find that moves that do not change the energy do not
alter the number of runs. Moreover, there are six configurations that reduce both
the energy and the number of runs. These are the pair 1324, 4231, and the quartet
2314, 1423, 3241, 4132. If we take their naive weights from the initial random
configuration, then such transpositions for every pair of adjacent points lead
to an average reduction in ⟨s⟩ of 2 x N x 2/24 + 1 x N x 4/24 = N/3. So the
final value after one Monte Carlo sweep is ⟨s⟩ = 2N/3 - N/3 = N/3.
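The entries of Table 1 can be regenerated mechanically, which also checks the counting used in this paragraph. In the sketch below the four labels are used directly as values; this is sufficient because the sign of the energy change depends only on their relative order. The energy change is taken to be the change in the sum of neighbouring distances, and the function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations


def count_turning_points(seq):
    """Number of interior elements that are a local maximum or minimum."""
    return sum(
        1
        for a, b, c in zip(seq, seq[1:], seq[2:])
        if (b > a and b > c) or (b < a and b < c)
    )


def sign(v):
    return (v > 0) - (v < 0)


def central_swap_effect(q):
    """Return (sign of dE, ds) for transposing the central pair of the 4-sequence q."""
    a, b, c, d = q
    d_energy = abs(a - c) + abs(b - d) - abs(a - b) - abs(c - d)
    swapped = (a, c, b, d)
    return sign(d_energy), count_turning_points(swapped) - count_turning_points(q)


table = {q: central_swap_effect(q) for q in permutations((1, 2, 3, 4))}

# Configurations for which the swap lowers both the energy and the number of runs.
reducing_both = {q for q, (de, ds) in table.items() if de < 0 and ds < 0}
# Average reduction in turning points per adjacent pair, using naive weights 1/24.
mean_reduction = Fraction(sum(-ds for de, ds in table.values() if de < 0), 24)
print(sorted(reducing_both), mean_reduction)
```

The average reduction per adjacent pair is 1/3, so summing over the N adjacent pairs gives the reduction of N/3 quoted above.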

The argument for the average value at a maximum is extended from 3 (given
in the example above) to 4 points, where it reproduces the 3 point case when all
contributions are included. However, some of these contributions are removed
by the Monte Carlo move (we only include those configurations with ΔE > 0
in table 1), resulting in ⟨Δ⟩ = 3N/5.

Overall we obtain the estimate ⟨E_TSP(N)⟩ = (1/2N) x (N/3) x (3N/5) = N/10,
to be compared with the numerical value of E/N = 0.1092 ± 0.0001. The
approximations in this approach are quite drastic. First we ignored correlations
between the number of runs and the value of Δ for that run in formula (17).
Then we just looked at the effect of one Monte Carlo sweep, used naive weights
and ignored any influence of one Monte Carlo move on another. It is therefore
surprising that we obtain such a reasonable estimate. Indeed, numerical studies
show that the estimates of both ⟨s⟩ and ⟨Δ⟩ are incorrect by about 10%.